# Efficient Multimodal
## nanoVLM-222M
- Author: lusxvr
- License: Apache-2.0
- Task: Image-to-Text
- Downloads: 2,441 · Likes: 73

nanoVLM is an ultra-minimalist, lightweight vision-language model (VLM) designed for efficient training and experimentation.
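As a rough illustration of how such a compact checkpoint might be loaded for quick experiments, the sketch below assumes the nanoVLM repository is on the Python path and exposes a `VisionLanguageModel.from_pretrained` method, and that the checkpoint is published as `lusxvr/nanoVLM-222M`; treat the exact import path and API as assumptions to verify against the repository.

```python
# Minimal sketch, assuming the nanoVLM repository is available locally and
# exposes VisionLanguageModel.from_pretrained (verify against the repo).
import torch
from models.vision_language_model import VisionLanguageModel  # assumed import path

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the ~222M-parameter checkpoint from the Hugging Face Hub.
model = VisionLanguageModel.from_pretrained("lusxvr/nanoVLM-222M").to(device)
model.eval()

# The small parameter count is what makes fast fine-tuning experiments practical.
print(sum(p.numel() for p in model.parameters()), "parameters")
```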
## LLaVA-Mini-Llama-3.1-8B
- Author: ICTNLP
- License: GPL-3.0
- Task: Image-to-Text
- Downloads: 12.45k · Likes: 51

LLaVA-Mini is an efficient multimodal large model that significantly improves the efficiency of image and video understanding by representing each image with only one visual token.
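To make the single-visual-token idea concrete, here is a toy sketch that pools a grid of vision-encoder patch embeddings into one token with a learned query; this is not LLaVA-Mini's actual compression module, only an illustration of the general technique, and all dimensions and names are assumptions.

```python
# Illustrative sketch only: compress ViT patch embeddings into a single
# "visual token" via learned-query attention. NOT LLaVA-Mini's real module.
import torch
import torch.nn as nn

class SingleTokenCompressor(nn.Module):
    def __init__(self, dim: int = 1024, num_heads: int = 8):
        super().__init__()
        # One learned query vector yields exactly one output token.
        self.query = nn.Parameter(torch.randn(1, 1, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, patch_embeds: torch.Tensor) -> torch.Tensor:
        # patch_embeds: (batch, num_patches, dim), e.g. 576 patches from a ViT.
        q = self.query.expand(patch_embeds.size(0), -1, -1)
        out, _ = self.attn(q, patch_embeds, patch_embeds)
        return out  # (batch, 1, dim): a single token handed to the language model

compressor = SingleTokenCompressor()
patches = torch.randn(2, 576, 1024)  # fake vision-encoder output for two images
print(compressor(patches).shape)     # torch.Size([2, 1, 1024])
```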